1 Enhancing software reliability with speculative threads
Jeffrey Oplinger, Monica S. Lam

October 2002 Proceedings of the 10th international conference on Architectural
support for programming languages and operating systems, volume 37, 36, 30
issue 10, 5, 5

This paper advocates the use of a monitor-and-recover programming paradigm to enhance
the reliability of software, and proposes an architectural design that allows software and 
hardware to cooperate in making this paradigm more efficient and easier to program. We 
propose that programmers write monitoring functions assuming simple sequential execution 
semantics. Our architecture speeds up the computation by executing the monitoring 
functions speculatively in parallel with the main computation. For ... 



2 A resource management framework for priority-based physical-memory allocation
Kingsley Cheung, Gernot Heiser

January 2002 Australian Computer Science Communications , Proceedings of the 
seventh Asia-Pacific conference on Computer systems architecture - 

Volume 6, volume 24 Issue 3 

Most multitasking operating systems support scheduling priorities in order to ensure that 
processor time is allocated to important or time-critical processes in preference to less 
important ones. Ideally this would prevent a low-priority process from slowing the 
execution of a high-priority one. In practice, strict prioritisation is undermined by a lack of 
suitable allocation policy for resources other than CPU time. For example, a low priority 
process may degrade the execution speed of a high-p ... 

3 Distributed systems - programming and management: On remote procedure call
Patricia Gomes Soares

November 1992 Proceedings of the 1992 conference of the Centre for Advanced Studies 
on Collaborative research - Volume 2 

The Remote Procedure Call (RPC) paradigm is reviewed. The concept is described, along 
with the backbone structure of the mechanisms that support it. An overview of works in 
supporting these mechanisms is discussed. Extensions to the paradigm that have been 
proposed to enlarge its suitability, are studied. The main contributions of this paper are a 
standard view and classification of RPC mechanisms according to different perspectives, and 
a snapshot of the paradigm in use today and of goals for t ...

4 Programming languages for distributed computing systems
Henri E. Bal, Jennifer G. Steiner, Andrew S. Tanenbaum
September 1989 ACM Computing Surveys (CSUR), volume 21 issue 3

When distributed systems first appeared, they were programmed in traditional sequential
languages, usually with the addition of a few library procedures for sending and receiving 
messages. As distributed applications became more commonplace and more sophisticated, 
this ad hoc approach became less satisfactory. Researchers all over the world began 
designing new programming languages specifically for implementing distributed 
applications. These languages and their history, their underlying pr ... 

5 Implementation of Argus
B. Liskov, D. Curtis, P. Johnson, R. Scheifler

November 1987 ACM SIGOPS Operating Systems Review, Proceedings of the eleventh
ACM Symposium on Operating systems principles, volume 21 issue 5 

Argus is a programming language and system developed to support the construction and 
execution of distributed programs. This paper describes the implementation of Argus, with 
particular emphasis on the way we implement atomic actions, because this is where Argus 
differs most from other implemented systems. The paper also discusses the performance of 
Argus. The cost of actions is quite reasonable, indicating that action systems like Argus are 
practical. 

6 Fast detection of communication patterns in distributed executions
Thomas Kunz, Michiel F. H. Seuren

November 1997 Proceedings of the 1997 conference of the Centre for Advanced Studies
on Collaborative research 

Understanding distributed applications is a tedious and difficult task. Visualizations based on 
process-time diagrams are often used to obtain a better understanding of the execution of 
the application. The visualization tool we use is Poet, an event tracer developed at the 
University of Waterloo. However, these diagrams are often very complex and do not provide 
the user with the desired overview of the application. In our experience, such tools display 
repeated occurrences of non-trivial commun ... 

7 A polylog time wait-free construction for closed objects
Tushar Deepak Chandra, Prasad Jayanti, King Tan 

June 1998 Proceedings of the seventeenth annual ACM symposium on Principles of 
distributed computing 



8 Partitioned garbage collection of a large object store
Umesh Maheshwari, Barbara Liskov

June 1997 ACM SIGMOD Record, Proceedings of the 1997 ACM SIGMOD international
conference on Management of data, volume 26 issue 2

We present new techniques for efficient garbage collection in a large persistent object store.
The store is divided into partitions that are collected independently using information about
inter-partition references. This information is maintained on disk so that it can be recovered
after a crash. We use new techniques to organize and update this information while
avoiding disk accesses. We also present a new global marking scheme to collect cyclic
garbage across partitions. Global marking ...

Keywords: cyclic garbage, garbage collection, object database, partitions 



9 GUM: a portable parallel implementation of Haskell 

P. W. Trinder, K. Hammond, J. S. Mattson, A. S. Partridge, S. L. Peyton Jones

May 1996 ACM SIGPLAN Notices, Proceedings of the ACM SIGPLAN 1996 conference
on Programming language design and implementation, volume 31 issue 5

GUM is a portable, parallel implementation of the Haskell functional language. Despite
sustained research interest in parallel functional programming, GUM is one of the first such
systems to be made publicly available. GUM is message-based, and portability is facilitated
by using the PVM communications harness that is available on many multi-processors. As a
result, GUM is available for both shared-memory (Sun SPARCserver multiprocessors) and 
distributed-memory (networks of workstations) architec ... 

10 Providing high availability using lazy replication

Rivka Ladin, Barbara Liskov, Liuba Shrira, Sanjay Ghemawat

November 1992 ACM Transactions on Computer Systems (TOCS), volume 10 issue 4

To provide high availability for services such as mail or bulletin boards, data must be
replicated. One way to guarantee consistency of replicated data is to force service
operations to occur in the same order at all sites, but this approach is expensive. For some
applications a weaker causal operation order can preserve consistency while providing
better performance. This paper describes a new way of implementing causal operations.
Our technique also supports two other kinds of operations: ... 

Keywords: client/server architecture, fault tolerance, group communication, high 
availability, operation ordering, replication, scalability, semantics of application 

11 Transactional memory: architectural support for lock-free data structures
Maurice Herlihy, J. Eliot B. Moss

May 1993 ACM SIGARCH Computer Architecture News, Proceedings of the 20th
annual international symposium on Computer architecture, volume 21 issue 2

A shared data structure is lock-free if its operations do not require mutual exclusion. If one 
process is interrupted in the middle of an operation, other processes will not be prevented 
from operating on that object. In highly concurrent systems, lock-free data structures avoid 
common problems associated with conventional locking techniques, including priority
inversion, convoying, and difficulty of avoiding deadlock. This paper introduces 
transactional memory 

12 Understanding the limitations of causally and totally ordered communication 
David R. Cheriton, Dale Skeen 

December 1993 ACM SIGOPS Operating Systems Review, Proceedings of the
fourteenth ACM symposium on Operating systems principles, volume 27
issue 5

Causally and totally ordered communication support (CATOCS) has been proposed as
important to provide as part of the basic building blocks for constructing reliable distributed
systems. In this paper, we identify four major limitations to CATOCS, investigate the
applicability of CATOCS to several classes of distributed applications in light of these
limitations, and the potential impact of these facilities on communication scalability and
robustness. From this investigation, we find limited meri ... 

13 Fault tolerance under UNIX 

Anita Borg, Wolfgang Blau, Wolfgang Graetsch, Ferdinand Herrmann, Wolfgang Oberle 
January 1989 ACM Transactions on Computer Systems (TOCS), volume 7 issue 1

The initial design for a distributed, fault-tolerant version of UNIX based on three-way 
atomic message transmission was presented in an earlier paper [3]. The implementation 
effort then moved from Auragen Systems to Nixdorf Computer where it was completed.
This paper describes the working system, now known as the TARGON/32. The original 
design left open questions in at least two areas: fault tolerance for server processes and 
recovery after a crash were brie ... 

14 Industry/government track papers: TiVo: making show recommendations using a
distributed collaborative filtering architecture

Kamal Ali, Wijnand van Stam

August 2004 Proceedings of the 2004 ACM SIGKDD international conference on 
Knowledge discovery and data mining 

We describe the TiVo television show collaborative recommendation system which has been 
fielded in over one million TiVo clients for four years. Over this install base, TiVo currently 
has approximately 100 million ratings by users over approximately 30,000 distinct TV
shows and movies. TiVo uses an item-item (show to show) form of collaborative filtering 
which obviates the need to keep any persistent memory of each user's viewing preferences 
at the TiVo server. Taking advantage of TiVo's client- ... 

Keywords: clustering clickstreams, collaborative-filtering 
Turing, a new general purpose programming language, is designed to have Basic's clean
interactive syntax, Pascal's elegance, and C's flexibility.

16 File and storage systems: The Google file system
Sanjay Ghemawat, Howard Gobioff, Shun-Tak Leung

October 2003 Proceedings of the nineteenth ACM symposium on Operating systems
principles

Abstract

The orphan detection and elimination issue has been
addressed extensively in the distributed transaction
management context and more recently in the distributed
systems context within the RPC framework. An orphan
in a distributed object-oriented application environment
is an object which is created by an application
and continues to exist beyond the application's lifetime.
This paper addresses orphan detection and elimination
for applications in a distributed object-oriented system.
We apply our model to a distributed Unix/C++ environment,
but the model is general and can be applied to
other distributed object-oriented systems.



1 Introduction 



A distributed system consists of multiple computers
(called hosts) that communicate through a network. A 
distributed application is one whose components reside 
and execute at multiple hosts in a distributed system. In 
a distributed object-oriented application environment, 
we envision the object-oriented programming language 
to be the view of the application programmer in a possibly
heterogeneous distributed object system.

An orphan in a distributed object-oriented applica- 
tion environment is a non-persistent object which is 
created by an application and continues to exist beyond
the application's lifetime. The orphan detection
and elimination issue has been addressed extensively
in the distributed transaction management context
[Herlihy 87, Herlihy 89, Liskov 87, McKendry 86,
Mueller 83, Walker 84] and more recently in the distributed
systems context within the RPC framework
[Birrell 83, Casey 86, Lampson 81, Panzieri 88,
Ravindran 89, Shrivastava 83].

A widely accepted approach in constructing reli- 
able distributed programs is the use of atomic actions 
(also called transactions). Atomicity encompasses three
properties: serializability, recoverability, and permanence
of effect. Serializability [Gray 78, Bernstein 81,
Papadimitriou 79] means that the execution of one activity
never appears to overlap or contain the execution
of another activity. Recoverability means that the overall
effect of an activity is all-or-nothing, and permanence
of effect means that once an action has committed, its
effects will survive failures [Bernstein 83, Attar 84, Gray 81].

In a distributed transaction system, an orphan can be 
created because of crashes or aborted transactions. Or- 
phans in such an environment are undesirable because 
they are an activity executing on behalf of an aborted 
transaction, and they can introduce unnecessary delay 
or deadlock by holding locks needed by nonorphans. In 
that sense, it is not only desirable to eliminate orphans, 
but also it should be done in a timely manner. 

There have been many algorithms for orphan elimination
in a distributed transaction environment: eager
algorithms [Herlihy 89, McKendry 86] which use
real-time clocks to ensure that orphans are eliminated
within a fixed duration, lazy algorithms [Herlihy 89,
McKendry 86] which use logical clocks to ensure that
orphans are eventually eliminated as information propagates
through the system, and a topological change algorithm
suitable for dealing with orphans resulting from
network partitioning [Mueller 83] in a distributed transaction
system. Mueller's algorithm is driven by both the
transaction home sites and the transaction synchronization
sites (TSS) for the different files opened from the
transaction.

Outside the transaction domain, the orphan elimination
problem was first identified by Birrell [Birrell 83],
and solutions based on timeouts have been proposed by
Lampson [Lampson 81] and Rajdoot [Panzieri 88]. Depending
on the RPC semantics in a distributed system,
additional requirements might be needed for correctness.
In particular, for at-most-once calls, not only
should orphans be aborted, but the abort must also
undo any side effects that may have been produced
[Panzieri 88]. Walker [Walker 84] has proposed a
transaction-based orphan elimination scheme that dynamically
tracks dependencies among transactions. His
scheme uses timeouts to keep the information sent in
messages to a manageable level.

In a conventional transaction system, orphans have no
impact on data consistency, since a transaction does
not interact with the outside world until it commits.
However, in a general purpose distributed system, such
inconsistencies may be more problematic. For example,
the Argus system [Liskov 83] supports a methodology
in which user-defined atomic data types are implemented
by a mixture of atomic and nonatomic data
types at a lower level. In the absence of an orphan
elimination scheme, the implementor of such a type is
responsible for ensuring that transient inconsistencies in
the atomic components do not produce permanent inconsistencies
in the nonatomic components. An orphan
elimination scheme based on Walker's method has been
implemented as part of the Argus system [Liskov 87] to
deal with that issue.

This paper addresses orphan detection and elimination
for applications in a distributed object-oriented system.
We apply our model to a distributed Unix/C++
environment, but the model is general and can be applied
to other distributed object-oriented systems. The
paper is organized as follows. Section 2 describes the
background for this work and gives definitions for the
terminology used in the paper. Section 3 presents the
problem definition. Section 4 discusses our model. Section
5 uses a simple example in a distributed Unix/C++
environment to describe our model in more detail, and
we conclude with a summary in Section 6.

2 Background and terminology 

In this section, we define some terms used by our model
in this paper.

2.1 Implementation set 

An implementation set is an object with a well defined 
interface representing the traditional Unix process ab- 
straction in an object-oriented system environment. An 
implementation set contains many objects (persistent 
and non-persistent), and exports operations to admin- 
ister these objects as well as control migrating objects 
between different implementation sets. Implementation 
sets are uniquely identified in the system by an
IS-id, which corresponds to a Unix process-id.

2.2 Distributed thread of control

Traditional threads [Cooper 88] will be referred to as
physical threads in this paper as they have their own
physical stacks. Typically, a physical thread is uniquely
identified within a process by a thread-id. A distributed
thread (dthread) represents the sequence (flow) of events
or invocations by an application (one or more physical
threads) in a distributed system, i.e., the dthread notion
is a virtual thread of control and it does not own private
physical stacks; rather, it uses potentially many of the
stacks belonging to existing physical threads.

Dthreads are created upon invoking operations on objects
asynchronously, i.e., an application might have
a tree of distributed threads. A distributed thread
terminates when the method originally invoked asynchronously
terminates.

The notion of dthread is used to help in tracing events 
and cleaning orphans later. Each dthread has a unique 
identifier in the system (dthread-id) assigned at cre- 
ation time in a decentralized manner using the tuple 
<machine-id : IS-id : thread-id>.¹
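
As an illustration of the decentralized identifier described above, the following is a minimal C++ sketch of the tuple. The struct name DthreadId, the helper make_dthread_id, and the 32-bit field widths are assumptions made for this example; the paper does not specify a concrete representation.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <tuple>

// Hypothetical representation of the <machine-id : IS-id : thread-id> tuple.
// Field widths are assumptions; the paper does not specify them.
struct DthreadId {
    std::uint32_t machine_id;  // host-id in a Unix context
    std::uint32_t is_id;       // implementation set id (Unix process-id)
    std::uint32_t thread_id;   // physical thread id within that process

    // Two ids are equal only if all three components match.
    bool operator==(const DthreadId& o) const {
        return std::tie(machine_id, is_id, thread_id) ==
               std::tie(o.machine_id, o.is_id, o.thread_id);
    }
    // Lexicographic ordering so the id can be used as a std::map key.
    bool operator<(const DthreadId& o) const {
        return std::tie(machine_id, is_id, thread_id) <
               std::tie(o.machine_id, o.is_id, o.thread_id);
    }
};

// Builds an id from purely local information, with no coordination
// between hosts, which is what makes the assignment decentralized.
inline DthreadId make_dthread_id(std::uint32_t machine, std::uint32_t is,
                                 std::uint32_t thread) {
    return DthreadId{machine, is, thread};
}

// Renders the id in the colon-separated form used in the text.
inline std::string to_string(const DthreadId& id) {
    return "<" + std::to_string(id.machine_id) + " : " +
           std::to_string(id.is_id) + " : " +
           std::to_string(id.thread_id) + ">";
}
```

Because each component is already unique within its enclosing scope (host, process, thread), the combined tuple is unique system-wide as long as machine ids are.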

2.3 Factories 

Factories are objects specialized in creating objects of a 
specific class, i.e., interface and implementation. They 
provide a uniform model for creating objects in a dis- 
tributed environment. This model allows the program- 
mer to create objects the same way independently of 
where they are created (i.e., in the local address space 
or in a different one). 

In traditional C++, the operator new() creates an
instance of a class only in the same address space. Factories
are responsible only for creating objects² of a specific
class. The interface to a factory object has one
method (create()), which creates instances of a particular
class in the same address space where the factory
is. The create() method can be thought of as a wrapper
around the new() operator in C++. Once the programmer
has a reference to a factory object, the programmer
can create objects of that specific class by invoking the
create() method on that factory independent of the
factory location.

¹ In a Unix context, machine-id corresponds to a host-id, IS-id
corresponds to a process-id, and thread-id is a unique identifier
of a physical thread within that process.
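
The idea of create() as a wrapper around new can be sketched as follows. This is a single-address-space illustration only: the Factory template, the Account class, and the created() counter are hypothetical names introduced here, and the remote invocation path that makes factories location-independent is not modeled.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical example class; stands in for any application class.
struct Account {
    std::string owner;
    explicit Account(std::string o) : owner(std::move(o)) {}
};

// A factory specialized for one class T. create() wraps operator new and
// always allocates in the address space where the factory itself lives.
// It does not delete or manage the objects it hands out.
template <typename T>
class Factory {
public:
    template <typename... Args>
    T* create(Args&&... args) {
        ++created_;
        return new T(std::forward<Args>(args)...);
    }
    // Bookkeeping for this sketch only; not part of the paper's interface.
    int created() const { return created_; }

private:
    int created_ = 0;
};
```

A caller holding a reference to a Factory<Account> would write factory.create("alice") and receive a pointer to the new object; in the distributed setting the same call would be forwarded to the implementation set that hosts the factory.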

2.4 Persistent objects

Persistent objects are objects whose lifetime is indepen- 
dent of the lifetime of the thread which created them. 
On the other hand, a non-persistent object's lifetime is
tied to the lifetime of the thread which created it; for
example, in a traditional Unix/C++ environment objects
are deleted automatically when the process terminates.

3 Problem statement 

The current memory management model for C++ is
to reclaim memory for all objects in a given process 
(address space) upon process termination. This model 
would not apply for persistent objects as by definition a 
persistent object's lifetime is independent of the lifetime 
of the dthread which created it. 

On the other hand, applying the current C++ mem- 
ory management model to non-persistent objects is a 
much more difficult problem and needs to be resolved. 
For example, consider a dthread that creates a remote 
non-persistent object and then terminates. This remote 
object becomes an orphan and needs to be reclaimed. 
The next section presents a lazy scheme where orphans are
guaranteed to be eliminated at the termination of the
dthread which created them.

² Notice that factories only create objects and do not delete or
manage them. Deleting an object will be dealt with in this paper
in a later section.

4 The model 

The issue of managing the application objects' storage
is fortunately limited only to two scenarios: an applica- 
tion object created as a result of remote invocation on 
a factory object, and an object migrating from one im- 
plementation set to another. We envision one approach 
for dealing with both scenarios and in our discussions 
we will focus only on the factory invocation scenario. 

To administer globally the storage of distributed ap- 
plication objects, we are adding two objects per imple- 
mentation set to help in performing this administrative 
function. These two objects are either created trans- 
parently as part of initializing new implementation sets 
or explicitly by requiring applications to call a library 
function from main(). Each of the two objects has a
database as part of its state representing the information
about a subset of the objects in the system, and
each object has a set of methods to query its database.
These two objects enable tracking where dthreads have
created objects and how to get to them. How they are
used is discussed in detail in the next section
through a simple example. The following is a description
of these two objects and the methods of their interfaces.

1. Local directory (LD) object: this is an object
per implementation set. The object's state
contains information about all local objects in
this implementation set which are created in response
to requests from other implementation sets.
An entry in this object's database is represented
as a tuple <dthread-id : IS-id : GID : local
object reference>. Figure 1 represents the local
directory interface declaration. Referring to
figure 3, an entry in the local directory object's
database in IS2 is described below:

• Dthread-id: is the identification of the dthread
(dthread1) which created this object (object1)
in this implementation set (IS2).

• IS-id: is the implementation set (IS1) from
which the dthread initiated the creation of this
object (object1).

• GID: is the object (object1) global unique-id.

• Local object reference: is a reference to the
local object (object1) in this implementation
set (IS2).

The add_entry() method inserts a tuple in the local
directory's state about how a local object in this
implementation set was created.

The different flavors of the delete_entry()
method delete the appropriate local object(s) in
this implementation set and clean their corresponding
entries in this local directory object.

2. Remote directory (RD) object: this is an object
per implementation set. The object's state
contains information about remotely created objects
from this implementation set. An entry
in this object's database is represented as a tuple
<dthread-id : GID : remote LD reference
: remote object reference>. Figure 2 represents
the remote directory interface declaration.
Referring to figure 3, an entry in the remote directory
object's database in IS1 is described below:

• Dthread-id: is the identification of the dthread
(dthread1) which created the remote object
(object1).

• GID: is the object (object1) global unique-id.

• Remote LD reference: is a remote reference to
the local directory object in the implementation
set where object1 is (i.e., LD2).

• Remote object reference: is a remote reference
to the actual object (i.e., object1) in IS2.

The add_entry() method inserts a tuple in the remote
directory's state about an object created remotely
from this implementation set.

The delete_entry() method deletes a remote object
and cleans the appropriate directory entries.

The delete_entries() method deletes all entries
in this remote directory object for objects created
from this implementation set in a specific remote
implementation set.

The check() method returns an entry from the remote
directory object corresponding to a specific
dthread-id.

interface local_directory =
    add_entry(dthread_id ID, IS_id IS, gid_t GID, ref_t LCL_OBJ_REF),
    delete_entry(ref_t LCL_OBJ_REF),
    delete_entry(gid_t GID),
    delete_entry(dthread_id ID, IS_id IS);

Figure 1: A local directory interface declaration

interface remote_directory =
    add_entry(dthread_id ID, gid_t GID, ref_t RMT_LD_REF, ref_t RMT_OBJ_REF),
    delete_entry(ref_t RMT_OBJ_REF),
    delete_entries(ref_t RMT_LD_REF),
    check(dthread_id ID);

Figure 2: A remote directory interface declaration
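
The two directory interfaces can be sketched in C++ as simple in-memory tables. This is an assumption-laden illustration rather than the authors' implementation: the type names (DthreadId, IsId, Gid, Ref), the vector-backed storage, the remove_where helper, and the returned counts are all introduced here for the example. The sketch only maintains the bookkeeping; a real delete_entry() would also destroy the objects.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Every name below is an assumption modeled on Figures 1 and 2.
using DthreadId = std::string;   // stands in for <machine-id : IS-id : thread-id>
using IsId = int;
struct Gid { int value; };       // global unique id of an object
struct Ref { int value; };       // opaque handle standing in for a reference
inline bool operator==(Gid a, Gid b) { return a.value == b.value; }
inline bool operator==(Ref a, Ref b) { return a.value == b.value; }

struct LocalEntry  { DthreadId dthread; IsId creator_is; Gid gid; Ref local_obj; };
struct RemoteEntry { DthreadId dthread; Gid gid; Ref remote_ld; Ref remote_obj; };

// Removes every element matching pred and reports how many were removed.
template <typename Entry, typename Pred>
std::size_t remove_where(std::vector<Entry>& v, Pred pred) {
    auto first = std::remove_if(v.begin(), v.end(), pred);
    std::size_t n = static_cast<std::size_t>(v.end() - first);
    v.erase(first, v.end());
    return n;
}

class LocalDirectory {
public:
    void add_entry(const DthreadId& id, IsId is, Gid gid, Ref obj) {
        entries_.push_back(LocalEntry{id, is, gid, obj});
    }
    // The three flavors of delete_entry() from Figure 1.
    std::size_t delete_entry(Ref obj) {
        return remove_where(entries_,
            [&](const LocalEntry& e) { return e.local_obj == obj; });
    }
    std::size_t delete_entry(Gid gid) {
        return remove_where(entries_,
            [&](const LocalEntry& e) { return e.gid == gid; });
    }
    std::size_t delete_entry(const DthreadId& id, IsId is) {
        return remove_where(entries_, [&](const LocalEntry& e) {
            return e.dthread == id && e.creator_is == is;
        });
    }
    std::size_t size() const { return entries_.size(); }

private:
    std::vector<LocalEntry> entries_;
};

class RemoteDirectory {
public:
    void add_entry(const DthreadId& id, Gid gid, Ref remote_ld, Ref remote_obj) {
        entries_.push_back(RemoteEntry{id, gid, remote_ld, remote_obj});
    }
    std::size_t delete_entry(Ref remote_obj) {
        return remove_where(entries_,
            [&](const RemoteEntry& e) { return e.remote_obj == remote_obj; });
    }
    std::size_t delete_entries(Ref remote_ld) {
        return remove_where(entries_,
            [&](const RemoteEntry& e) { return e.remote_ld == remote_ld; });
    }
    // Returns one entry belonging to the dthread, or nullptr when none is left.
    const RemoteEntry* check(const DthreadId& id) const {
        for (const RemoteEntry& e : entries_)
            if (e.dthread == id) return &e;
        return nullptr;
    }

private:
    std::vector<RemoteEntry> entries_;
};
```

Making Gid and Ref distinct types lets the overloaded delete_entry() flavors be told apart at compile time, which mirrors the overloading in Figure 1.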

Two new library functions are supplied as part of the
system: one eliminates orphan objects on dthread
termination (terminate()) and the other is used for explicit
object deletion (kill()). These library functions
invoke operations on the appropriate directory objects
to track where dthreads created objects and how
to reach and delete them. Access to these two
directory objects is transparent to the application programmer
(i.e., the application need not perform explicit
operations on these directory objects); they are
accessed only from system functions. The following are three
scenarios illustrating when these directory objects are
accessed:

• On creating a remote object, the factory's
create() method initializes an entry in the local
directory object for the implementation set where
the factory is. Back on the client side, the create()
method will initialize an entry in the remote directory
object before returning a reference to the
application.

• On dthread termination, the terminate() library
function invokes operations on the appropriate directory
objects to track and delete orphans, and
then cleans the appropriate entries from these directory
objects' state.

• On explicit object deletion, the kill() library
function invokes operations on the appropriate directory
objects to track and delete the object (i.e.,
if the object is remote), and to clean the appropriate
entries from the directory objects' state.

5 Example 

To illustrate how an implemented version of our model
will work in a distributed Unix/C++ environment, we
use a simple example (figure 3) where dthread1, created
in IS1, creates object1 in IS2 and then terminates. In
this simple example, we assume that the application
already has a reference to factory1. The example will
discuss in detail how remote objects are created, how
orphans are eliminated at dthread termination, and how
objects are deleted explicitly.

5.1 Remote object creation 

Objects are created by invoking the create() method
on the appropriate factory. In this example (see figure
3), dthread1 in IS1 invokes the create() method on
factory1 in IS2 and as a result, object1 is created in
IS2, and a reference to the remote object is returned to
dthread1. In order to keep track of remote objects, appropriate
entries in the directory objects are initialized.
The following steps occur transparently to
the application:

• The factory in IS2, after creating the desired object
(object1), will invoke the add_entry() operation on
the LD2 object. This will add an entry in the local
directory object's database in IS2: <dthread1 : IS1
: GID : *object1>.

• In the RPC message returning the reference to object1,
the factory will add a reference to LD2.

• On crossing the machine boundary to IS1, an
add_entry() operation is invoked on the remote
directory object in IS1. This will add an entry
in the remote directory object's database in IS1:
<dthread1 : GID : *LD2 : *object1>.

• A reference to the remote object object1 is returned
to the application.

5.2 Dthread termination 

In the above example, upon dthread1 termination in
IS1, the terminate() library function is called to eliminate
all non-persistent objects created by dthread1
in the system which would otherwise become orphans. The
terminate() function checks (see figure 4) if there are
entries in the remote directory object (RD1) belonging
to dthread1. If no entries are found, then dthread1 terminates
and we are done.

For every entry found in the RD1 object belonging
to dthread1, the check(ID) operation returns a reference
(RMT_LD_REF) corresponding to a local directory object
(i.e., LD2). The terminate() function invokes the
delete_entry(GID) method on the local directory object
in the remote implementation set (IS2). This will
result in deleting object1 and the corresponding entry
from the local directory object in IS2. With no entries
in the remote directory object in IS2 corresponding
to dthread1, the delete_entry() operation is completed.
Finally, the terminate() function invokes the
delete_entries() method on the remote directory object
in IS1 to clean all entries with the RMT_LD_REF field
equal to the reference returned from the check() operation
(i.e., LD2).

Notice that the delete_entry() operation on the local
directory object, after completing its work in IS2,
checks if there are entries in the remote directory object
at IS2 belonging to dthread1. If there are entries,
delete_entry() is invoked on the new local directory
object as we discussed earlier. This will take care of
nested object creation across multiple implementation
sets in a distributed system.

[Figure 3: Creating a remote object and orphan reclamation. Labels recoverable from the diagram: IS1, IS2, local directory object (LD1), local directory object (LD2), remote directory object (RD2).]

terminate(dthread_id ID)
begin
    ref_t RMT_LD_REF;

    while (RMT_LD_REF := remote_directory->check(ID)) {
        RMT_LD_REF->delete_entry(ID, IS);
        remote_directory->delete_entries(RMT_LD_REF);
    }
end;

Figure 4: The terminate library function
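
The creation and termination steps of sections 5.1 and 5.2, including the nested case, can be simulated in one process. The sketch below is hypothetical throughout: the System and ImplementationSet structs, the integer ids, and the use of an IS id in place of a remote LD reference are assumptions for illustration. Two details also deviate from the figure as stated assumptions: remote directory entries are removed before the remote call so that cyclic creation chains terminate, and only entries of the terminating dthread are removed.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical in-process model: each implementation set (IS) is a struct,
// and a "remote LD reference" is just the integer id of the target IS.
struct LdEntry { std::string dthread; int creator_is; int gid; };
struct RdEntry { std::string dthread; int gid; int remote_is; };

struct ImplementationSet {
    std::set<int> objects;     // gids of live non-persistent objects
    std::vector<LdEntry> ld;   // local directory
    std::vector<RdEntry> rd;   // remote directory
};

struct System {
    std::map<int, ImplementationSet> sets;
    int next_gid = 1;

    // A dthread running in `from` asks a factory in `to` to create an object.
    int create_remote(const std::string& dthread, int from, int to) {
        int gid = next_gid++;
        sets[to].objects.insert(gid);
        sets[to].ld.push_back({dthread, from, gid});  // add_entry on LD(to)
        sets[from].rd.push_back({dthread, gid, to});  // add_entry on RD(from)
        return gid;
    }

    // Models delete_entry(ID, IS) on the local directory of `at`, followed by
    // the nested check of the remote directory in that implementation set.
    void delete_entry(const std::string& dthread, int creator, int at) {
        ImplementationSet& s = sets[at];
        for (std::size_t i = 0; i < s.ld.size();) {
            if (s.ld[i].dthread == dthread && s.ld[i].creator_is == creator) {
                s.objects.erase(s.ld[i].gid);   // reclaim the orphan
                s.ld.erase(s.ld.begin() + i);   // clean its LD entry
            } else {
                ++i;
            }
        }
        terminate(dthread, at);                 // follow nested creations
    }

    // Models the loop of Figure 4 for a dthread leaving implementation set `at`.
    void terminate(const std::string& dthread, int at) {
        for (;;) {
            int remote = -1;
            for (const RdEntry& e : sets[at].rd) {   // check(ID)
                if (e.dthread == dthread) { remote = e.remote_is; break; }
            }
            if (remote < 0) return;                  // nothing left to clean
            // Assumption: RD entries are dropped before the remote call and
            // only for this dthread (see the lead-in above).
            std::vector<RdEntry>& rd = sets[at].rd;
            for (std::size_t i = 0; i < rd.size();) {
                if (rd[i].dthread == dthread && rd[i].remote_is == remote) {
                    rd.erase(rd.begin() + i);
                } else {
                    ++i;
                }
            }
            delete_entry(dthread, at, remote);
        }
    }
};
```

For example, if a dthread in IS1 creates an object in IS2 and, while executing there, creates another in IS3, a single terminate() call in IS1 reclaims both, while objects created by other dthreads are left untouched.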

5.3 Explicit object deletion 

Factories are used to provide a uniform method of creating
objects in a distributed system. They are responsible
for creating objects and not for managing or deleting
them. This is because we believe that the programmer
should not have to remember the factory reference in
order to be able to delete an object. Instead, we introduce
a new kill() function to replace the delete()
function in C++.

If the object to be deleted is remote (i.e.,
dthread1 in IS1 is deleting object1 in IS2), the
kill(obj_ref) function (see figures 3, 5) invokes the
delete_entry(obj_ref) method on the remote directory
(RD1) in this implementation set (IS1). This in
turn invokes the delete_entry(GID) method on the appropriate
remote LD object (i.e., LD2). The net result
is deleting the remote object (object1) and cleaning the
appropriate entries in the directory objects, i.e., an entry
in the local directory object in IS2 and an entry in
the remote directory object in IS1.

On the other hand, if the object is local (e.g.,
dthread1 in IS2 is deleting object1 and obj_ref is a reference
to object1), the kill(obj_ref) function invokes
the delete_entry(obj_ref) method on the local directory
object in this implementation set (i.e., IS2). The
net result is deleting the object (object1) and cleaning
the appropriate entry in the local directory object (i.e.,
in IS2).
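
The two branches of kill() can be illustrated with a small in-process model. All names here (ObjRef, ImplementationSet, System, ld_delete_entry) are assumptions for this sketch, and the remote call is modeled as a direct function call; the paper's is_remote() test is approximated by comparing the caller's implementation set with the one named in the reference.

```cpp
#include <cassert>
#include <map>
#include <set>

// Hypothetical reference type: names the implementation set (IS) that holds
// the object, plus the object's global id.
struct ObjRef { int is_id; int gid; };

struct ImplementationSet {
    std::set<int> objects;   // live objects, keyed by gid
    std::set<int> ld;        // gids recorded in the local directory
    std::map<int, int> rd;   // remote directory: gid -> IS holding the object
};

struct System {
    std::map<int, ImplementationSet> sets;

    // Local directory delete_entry: delete the object and its LD entry.
    void ld_delete_entry(int at, int gid) {
        sets[at].objects.erase(gid);
        sets[at].ld.erase(gid);
    }

    // kill() as called by code running in implementation set `caller`.
    void kill(int caller, ObjRef ref) {
        if (ref.is_id != caller) {                // stands in for is_remote()
            ld_delete_entry(ref.is_id, ref.gid);  // forwarded to the remote LD
            sets[caller].rd.erase(ref.gid);       // clean the caller's RD entry
        } else {
            ld_delete_entry(caller, ref.gid);     // purely local deletion
        }
    }
};
```

In both branches the caller never needs the factory reference, which is the design point the section makes: the directories alone are enough to locate and delete the object.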

6 Conclusion 

The orphan detection and elimination issue has been 
investigated extensively in the distributed transaction 
context as well as in the distributed system context 
within the RPC framework. In this paper we extended
this investigation to a general distributed object-oriented
application environment. Orphans in a distributed
object-oriented application environment are
objects which are created by the application and continue
to exist beyond the application's lifetime. Orphan
objects in a distributed object-oriented application environment
correspond to activities which continue to execute
on behalf of aborted transactions in a distributed
transaction system.

We presented a simple and general model for detecting
and eliminating orphans in a distributed object-oriented
application environment. Also, we presented a
simple example to illustrate how our model would work
in a distributed Unix/C++ application environment.
An implementation of our model should not add much
overhead; we are planning to implement this model in
the near future as soon as the necessary platform environment
is completed.

void kill(ref_t obj_ref)
begin
    if (obj_ref->is_remote())
        remote_directory->delete_entry(obj_ref);
    else
        local_directory->delete_entry(obj_ref);
end;

Figure 5: The kill function